Abstract: We consider the problem of matching model and sensory data features in the 
presence of geometric uncertainty, for the purpose of object localization and identification. 
The problem is to construct sets of model feature and sensory data feature pairs that are 
geometrically consistent given that there is uncertainty in the geometry of the sensory data 
features. If there is no geometric uncertainty, polynomial-time algorithms are possible for 
feature matching, yet these approaches can fail when there is uncertainty in the geometry 
of data features. Existing matching and recognition techniques which account for the geometric 
uncertainty in features either cannot guarantee finding a correct solution, or can construct 
geometrically consistent sets of feature pairs yet have worst case exponential complexity 
in terms of the number of features. The major new contribution of this work is to 
demonstrate a polynomial-time algorithm for constructing sets of geometrically consistent 
feature pairs given uncertainty in the geometry of the data features. We show that under 
a certain model of geometric uncertainty the feature matching problem in the presence of 
uncertainty is of polynomial complexity. This has important theoretical implications by 
demonstrating an upper bound on the complexity of the matching problem, and by offering 
insight into the nature of the matching problem itself. These insights prove useful in the 
solution to the matching problem in higher dimensional cases as well, such as matching 
three-dimensional models to either two or three-dimensional sensory data. The approach is 
based on an analysis of the space of feasible transformation parameters. This paper outlines 
the mathematical basis for the method, and describes the implementation of an algorithm 
for the procedure. Experiments demonstrating the method are reported. 
1 Introduction 

A common approach to model-based object recognition is the hypothesis and verification 
paradigm. The pose in the image of a known object is hypothesized then evaluated and 
verified to localize the object in the image. The hypothesis stage often consists of deriving a 
set of geometric features from both the modeled object and the image data and determining 
correspondences between pairs of image and model features. From these correspondences a 
rigid transformation on the model features (and the model itself) is computed aligning the 
model geometrically with an hypothesized instance of the object in the image. This forms 
the object pose hypothesis which can then be evaluated and verified based on comparing 
the transformed model with the underlying image data. 

The central problem in this method of hypothesis construction is determining the sets 
of corresponding image and model features. This feature matching is difficult in the domain 
of object localization for three main reasons. Due to object occlusions in the scene, 
some model features may have no corresponding image feature; some features are missing. 
Because there are other objects in the image, some image features will not correspond to 
any model features; some features are spurious. Finally, due to inaccuracies in sensing 
and feature extraction, there is some uncertainty in the geometry of image features; their 
measured positions or orientations may deviate from the correct values. These three factors 
(missing, spurious, and distorted image features) conspire to make determining feature 
correspondences difficult. 

In this paper we focus on the central problem of image and model feature matching 
in the presence of missing, spurious, and distorted features, for the construction of pose 
hypotheses. In particular we define a model of the geometrical uncertainty of image features, 
and devise a tractable algorithm for determining all geometrically consistent sets of feature 
correspondences given the uncertainty tolerances. The paper is organized into six main 
sections. Section 2 outlines the formal basis of the approach, and section 3 outlines details of 
the computation and presents an algorithm for the construction of feature correspondences 
and pose hypotheses. The final sections present some experimental results, extensions to 
this work, related work, and the conclusions of the paper. 

2 The Idea: Approximate Matching 

This paper considers the case of localization of 2D objects given 2D sensory data such as 
grey-level images. The problem at hand is to construct sets of feature correspondences from 
which to form pose hypotheses. We first define what constitutes a feature: the primary 
attributes of the features we use for pose hypothesis are a definite position and orientation 
in the plane. For the first part of this development we make use of point features, whose 
position characterizes the position of a point on the boundary contour of an object, and 
whose orientation may, for example, characterize the orientation of the contour normal at 
the point, if this is stable. We later extend the method to use line segments. 






Figure 1: Shown are uncertainty bounds in position and orientation. Any position within a 
distance $\epsilon$ of the measured position is possible, as is any orientation within $\delta$ of the 
measured orientation. 



2.1 The Basic Elements 

This section presents the basic ideas involved in our approach to feature matching. 

2.1.1 Bounded Uncertainty 

A key assumption we make is that the uncertainty in the actual position and orientation 
of an image feature is bounded. This is a common assumption[8][4][3][7]. We consider two 
independent bounds on the positional and orientational uncertainty. Figure 1 illustrates 
these uncertainty bounds. The position of an image feature represents the measured position 
of the contour point. We assume the true position may deviate from the measured 
position by a maximum distance of $\epsilon$; thus the real position falls within a circle of radius 
$\epsilon$ centered at the measured position. We assume the true orientation may deviate from 
the measured orientation by a maximum angle of $\delta$; thus the real orientation falls within a 
range of orientations of length $2\delta$ centered at the measured orientation. 

2.1.2 Some Notation 

Figure 2: Model features on the left, image features on the right. A geometrically consistent 
match set is shown in figure 3. The positions are indicated by a small dot, and the orientations by 
a short line segment. 

We characterize a transformation by three parameters $\phi$, $u$, and $v$, where $\phi$ is the angle 
of a rotation about the origin and $u$ and $v$ are translations in the $x$ and $y$ directions, 
respectively. In the 2D domain all positions and orientations can be represented by vectors 
$\mathbf{v} \in \mathbb{R}^2$. Note that the vector space $\mathbb{C}$ of complex numbers is isomorphic to the vector 
space $\mathbb{R}^2$. For much of the analysis it will be convenient to use complex numbers to 
represent positions and orientations. With this representation, rotation about the origin by 
an angle $\phi$ corresponds to multiplication by the complex exponential $e^{i\phi}$. If $\mathbf{a} = (a_x, a_y)^T$, 
then as a complex number we have $\mathbf{a} \leftrightarrow a = a_x + i a_y$. Let the position of each model and image 
feature be represented by $p_m$ and $p_d$, respectively. Similarly, the orientations of the features 
are given by $\theta_m$ and $\theta_d$. We will denote a feature match as an ordered pair $(m, d)$, where 
$m = (p_m, \theta_m)$ and $d = (p_d, \theta_d)$. Finally, let $\mathcal{T}$ be the group of translations, $\mathcal{R}$ be the 
group of rotations about the origin, and the transformation group be $TPS = \mathcal{T} \otimes \mathcal{R}$, their 
composition.$^1$ We will let $T_\phi \in \mathcal{R}$ be the operator for rotation by $\phi$, and $T_t \in \mathcal{T}$ be the 
operator for translation by $t = u + iv$. A transformation $T = T_t \circ T_\phi$ is the composition of 
a rotation and a translation, applied in that order. 
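
As a minimal sketch of this notation, the following Python fragment represents positions as
complex numbers and applies a rotation followed by a translation. The function name
`apply_transform` and the convention that rotating a feature adds $\phi$ to its orientation are
illustrative assumptions, not taken from the paper.

```python
import cmath
import math

def apply_transform(p, theta, phi, t):
    """Rotate a feature (p, theta) by phi about the origin, then translate by t.

    p and t are complex numbers (x + iy); theta and phi are angles in radians.
    The name and signature are hypothetical, chosen for illustration only.
    """
    p_new = cmath.exp(1j * phi) * p + t          # T_t o T_phi applied to the position
    theta_new = (theta + phi) % (2 * math.pi)    # assumed: orientation rotates with the feature
    return p_new, theta_new

# Example: rotate the point (1, 0) by 90 degrees, then translate by u = 2, v = 0.
p_new, theta_new = apply_transform(1 + 0j, 0.0, math.pi / 2, 2 + 0j)
```

Treating the rotation as a single complex multiplication is what later makes the critical
rotation conditions polynomial in $e^{i\phi}$.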

2.2 Geometrically Consistent Match-Sets 

We will use the term feature match or match to describe a single pair of a model 
feature and an image feature. An important fundamental definition is that of a feasible 
transformation for a particular feature match, which is closely linked to the notion of a 
geometrically consistent set of feature matches which we will introduce in this section. Let 
$T$ be an arbitrary transformation, and let $(p'_m, \theta'_m) = T[(p_m, \theta_m)]$ represent a transformed 
model feature. For most $T \in TPS$ we have $|p'_m - p_d| > \epsilon$ or $|\theta'_m - \theta_d| > \delta$. That is, 
after transformation the model and image features will not be even approximately aligned 
geometrically. But for some transformations the two matched features will be approximately 
aligned geometrically. Let $F_{m,d} \subset TPS$ be the set 

$$F_{m,d} = \{T \in TPS : |p'_m - p_d| \le \epsilon,\ |\theta'_m - \theta_d| \le \delta,\ (p'_m, \theta'_m) = T[(p_m, \theta_m)]\},$$

that is, $F_{m,d}$ is the set of transformations on the model feature $m$ which leave it within 
$\epsilon$ in position, and $\delta$ in orientation, of the data feature $d$. We define the set of feasible 
transformations for the feature match $(m, d)$ to be the set $F_{m,d}$. 



1 $TPS = \mathcal{T} \otimes \mathcal{R}$ is the semi-direct product of the two subgroups. 
Figure 3: The model features transformed and plotted with the image features. The image 
features have uncertainty circles of radius $\epsilon$ around them. The model and image feature 
pairs which fall within $\epsilon$ and $\delta$ of one another form a geometrically consistent match set. 
The model and image features alone are shown in figure 2. In this picture, the model 
features from figure 2 have been rotated by about 170 degrees counterclockwise, translated 
and plotted over the image features. 




Figure 4: Left: the model and image features for a particular rotation, $\phi$, of the model 
feature, shown in position space. The model feature's orbit under rotation is shown as a 
dotted circle. Right: the match-disc of feasible translations in translation space for this 
rotation of the model feature. Any translation $t = u + iv$ within this disc will leave the 
rotated model feature within $\epsilon$ of the image feature. The dotted circle is the orbit of the 
match-disc as a function of the rotation of the model feature. Figure 7 shows this same 
rotation along with two nearby rotations and the effects on the match-disc. 
Figure 5: Left: Position space, showing the model features and the image features 
near the correct rotation of the model features; both are plotted together. Right: The 
match circles from all possible pairings of the model and image features. The dotted circles 
are not valid match-discs for the particular rotation because the differences in orientation 
for the feature matches associated with these circles are greater than $\delta$. Note there are 
translations simultaneously aligning three matches. 



We will use the term match-set to describe a set of model and image feature matches. 
Note that a given image or model feature can appear in several different matches in the 
same match-set, thus the mapping is not one-to-one. Let $M = \{(m_i, d_j)\}$ be a set of feature 
matches. Such a match-set is called geometrically consistent if 

$$\bigcap_{(m_i, d_j) \in M} F_{m_i,d_j} \neq \emptyset,$$

that is, if there exists some transformation which is geometrically consistent for all $(m_i, d_j) \in M$. 
Intuitively, the overlapping sets $F_{m_i,d_j}$ for all $(m_i, d_j) \in \{m_i\} \times \{d_j\}$ divide TPS into 
equivalence classes where each class is associated with a different geometrically consistent 
match-set. More formally, let $T \in TPS$ be a transform, and define $\varphi(T)$ to be the set 
$\{(m_i, d_j) : T \in F_{m_i,d_j}\}$.$^2$ The function $\varphi(T)$ partitions TPS forming equivalence classes $E_k$ 
where $TPS = \bigcup_k E_k$ and $T \equiv T' \iff \varphi(T) = \varphi(T')$. The classes $\{E_k\}$ correspond to the set of all 
maximal geometrically consistent match-sets. They are maximal in the sense that for each of 
them no other feature match can be added while maintaining feasibility. Geometrically, $F_{m,d}$ is 
isomorphic to a connected volume in $\mathbb{R}^2 \times SO_2$ with a cylindrical tube shape. The union of 
the boundaries of these volumes divides $\mathbb{R}^2 \times SO_2$ into cells, where each cell is isomorphic 
to an equivalence class $E_k$. Figure 6 illustrates several overlapping sets $F_{m,d}$. Note that 
clustering techniques such as the Hough transform seek to find cells in TPS which are 
contained within many of the $F_{m,d}$ by using a crude quantization of TPS. 
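
The definitions of $F_{m,d}$ and $\varphi(T)$ can be sketched directly in code. The following is an
illustrative Python fragment only: features are assumed to be `(position, orientation)` tuples
with complex positions, the bounds are taken as inclusive, and the helper names `angle_diff`,
`is_feasible`, and `match_set` are hypothetical.

```python
import cmath
import math

def angle_diff(a, b):
    """Smallest absolute difference between two angles, in [0, pi]."""
    d = (a - b) % (2 * math.pi)
    return min(d, 2 * math.pi - d)

def is_feasible(phi, t, m, d, eps, delta):
    """True if the transformation (phi, t) lies in F_{m,d}."""
    (p_m, theta_m), (p_d, theta_d) = m, d
    p_new = cmath.exp(1j * phi) * p_m + t        # transformed model position
    return (abs(p_new - p_d) <= eps
            and angle_diff(theta_m + phi, theta_d) <= delta)

def match_set(phi, t, model, data, eps, delta):
    """varphi(T): index pairs (i, j) of all matches for which T is feasible."""
    return {(i, j)
            for i, m in enumerate(model)
            for j, d in enumerate(data)
            if is_feasible(phi, t, m, d, eps, delta)}
```

Two transformations fall in the same equivalence class $E_k$ exactly when `match_set` returns
the same set for both.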

The main goal then is to determine all maximal geometrically consistent match-sets. 
Each such match-set is associated with a set of geometrically consistent transformations, 
and thus can be refined to form an hypothesis on the pose of the object in the image. 
We describe a method to enumerate the set $\{\varphi(T) \mid T \in TPS\}$ of all maximal geometrically 
consistent match-sets. The algorithm is polynomial in the number of model and image 
features. 

2 To be precise define $F_\emptyset = TPS - \bigcup_{i,j} F_{m_i,d_j}$. So for $T \in F_\emptyset$, $\varphi(T) = \emptyset$. 

Figure 6: Two views of transformation parameter space. The vertical axis is rotation, 
$\phi$, and the other two are translations $u$ and $v$. Shown are a number of geometrically 
consistent transformation volumes for nine different feature pairs. Each set of consistent 
transformations is a helical tube, with extent in the vertical direction depending on $\delta$, the 
angle uncertainty, and circular cross section depending on $\epsilon$, the positional uncertainty. 
Slices of this space at fixed rotation $\phi$ are shown in figures 11 and 12. 



2.3 Topological Analysis of Transform Space 

To introduce the feature matching and hypothesis technique we'll first consider the 
case of simple point features without any associated orientation. Thus the only relevant 
uncertainty bound is the uncertainty in image feature position, $\epsilon$. We'll then consider the 
case including orientation. 

A crucial conceptual and algorithmic approach will be to consider the group of translations, 
$\mathcal{T}$, separately from the group of rotations $\mathcal{R}$. Define $\psi(T_t, T_\phi)$ to be the set 
$\{(m_i, d_j) : T = T_t \circ T_\phi \in F_{m_i,d_j}\}$. For a fixed rotation operator $T_\phi$, the function $\psi(T_t, T_\phi)$ 
partitions the translation group $\mathcal{T}$ where, as above, $T_t \equiv T_{t'} \iff \psi(T_t, T_\phi) = \psi(T_{t'}, T_\phi)$. 
Let the equivalence classes be $E^t_l$, and define the function $\Psi(T_\phi) = \{E^t_l\}$ to be the set of 
equivalence classes of translations $T_t$ for fixed rotation $T_\phi$. The function $\Psi(T_\phi)$ partitions 
the set $\mathcal{R}$ of rotation operators, where $T_\phi \equiv T_{\phi'} \iff \Psi(T_\phi) = \Psi(T_{\phi'})$. Denote these 
equivalence classes by $E^\phi_k$. This partition of $\mathcal{R}$ is crucial to the matching approach we develop 
here. 



2.3.1 Feasible Translations 

Consider an arbitrary but fixed angle of rotation $\phi$, and a single feature match $(m_i, d_j)$. 
If the model feature's position is rotated by an angle $\phi$, the translation exactly aligning 
the two points is given by 

$$t_{ij} = p_{d_j} - e^{i\phi} p_{m_i}.$$

But we only require a transformation approximately aligning the two features. We call any 
translation $T_t$ for which $|p_{d_j} - T_t[T_\phi[p_{m_i}]]| \le \epsilon$ a feasible translation. Define $C_{ij}(T_\phi)$ to be 
the set of feasible translations for feature match $(m_i, d_j)$ after the rotation $T_\phi$ has been applied, 
thus 

$$C_{ij}(T_\phi) = \{t_{ij} + z : z \in \mathbb{C},\ |z| \le \epsilon,\ t_{ij} = p_{d_j} - T_\phi[p_{m_i}]\}.$$

For each feature match, the set of feasible translations $C_{ij}(T_\phi)$ for any fixed rotation $T_\phi$ is 
a disc in translation space; call this a match-disc. In this case, the regions of translation 
space formed when two or more discs have a common intersection are the equivalence classes 
of translation, $E^t_l$. Said another way, if we consider the fixed rotation $\phi_0$ but consider all 
possible matches $\{(m_i, d_j)\} = \{m_i\} \times \{d_j\}$ and the set of associated match-discs $\{C_{ij}(T_{\phi_0})\}$, 
the function $\psi(T_t, T_{\phi_0})$ partitions the translation space into equivalence classes $E^t_l$ where 
$T_t \equiv T_{t'}$ when for all $(i, j)$, $T_t \in C_{ij}(T_{\phi_0}) \iff T_{t'} \in C_{ij}(T_{\phi_0})$. 
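
For point features at a fixed rotation, the match-disc construction reduces to a few lines. The
sketch below is illustrative: positions are assumed to be complex numbers, discs are taken as
closed, and the names `match_disc_center` and `translation_match_set` are hypothetical.

```python
import cmath

def match_disc_center(p_m, p_d, phi):
    """Center t_ij = p_d - e^{i phi} p_m of the match-disc for one feature match."""
    return p_d - cmath.exp(1j * phi) * p_m

def translation_match_set(t, phi, model_pts, data_pts, eps):
    """psi(T_t, T_phi): all matches (i, j) whose match-disc contains translation t."""
    return {(i, j)
            for i, p_m in enumerate(model_pts)
            for j, p_d in enumerate(data_pts)
            if abs(t - match_disc_center(p_m, p_d, phi)) <= eps}
```

Two translations belong to the same translational equivalence class at rotation $\phi$ exactly
when `translation_match_set` returns the same set for both.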

2.3.2 Topological Boundaries 

We have been considering a fixed rotation $\phi_0$. We now consider what happens to 
geometrically consistent match sets as the rotation $\phi$ varies. We showed that the center 
of each match-disc $C_{ij}$ is the point in $\mathcal{T}$ given by $t_{ij} = p_{d_j} - e^{i\phi} p_{m_i}$. Thus as $\phi$ varies, 
the match disc follows a circular path of radius $|p_{m_i}|$ centered at the point $p_{d_j}$ in the $u$-$v$ 
plane. See figure 7. As the rotation $\phi$ varies the topology of intersections of match discs 
$C_{ij}(T_\phi)$ in $\mathcal{T}$ changes; that is, the set $\Psi(T_\phi) = \{E^t_l\}$ changes. This partitions the rotation 
group $\mathcal{R}$. This is the key insight of the method. By partitioning $\mathcal{R}$, each equivalence 
class $E^\phi_k$ is associated with a partition of $\mathcal{T}$; together these form the partition of the whole 
transformation group TPS yielding the set of geometrically consistent match sets 

$$\bigcup_{T_\phi, T_t} \{\psi(T_t, T_\phi)\}.$$

Figures 8 and 9 show examples of different intersection topologies as the rotation $\phi$ is varied. 
Note that as yet we have not required a match set to be a one-to-one correspondence between 
model and image features. We will consider this below. 

2.3.3 Determining Topological Boundaries 

The function $\Psi(T_\phi) = \{E^t_l\}$ changes value when the topology of feasible match-disc 
intersections changes with variations in the rotation angle $\phi$. There are three different 
events associated with this change. The first case is when two match-discs are intersecting 
and then move such that they are just tangent, and then completely separate. The $\phi$ at 
which tangency occurs marks a topological boundary. The second case is when three match-discs 
are mutually intersecting, then move such that they still intersect pairwise but have 
no common intersection. The $\phi$ at which the three boundary circles intersect at a single 
point marks a topological boundary. Finally, the $\phi$'s at which two match-discs exactly 
coincide mark a topological boundary. These three cases are illustrated in figures 8, 9 and 
10. The fact that these are the only cases we need consider is discussed in the appendix 
in section 9. Considering all pairs and all triples of match-discs, we can compute those $\phi$'s 
marking topological boundaries in the rotation space, by analyzing all occurrences of the 
above three cases. Call these $\phi$'s Type I critical rotations. An example of a type I critical 
rotation is shown in figure 11. This figure is a slice of constant $\phi$ of the transformation 
space shown in figure 6. 

Figure 7: Snapshots of position space (the coordinates in which we represent feature 
positions) and translation space (the coordinates in which we represent relative translations) 
for three different rotations applied to the model feature. The match-circle in translation 
space is dotted when the difference in orientation between the image and model feature is 
greater than the uncertainty bound in orientation. 

Figure 8: Three snapshots of translation space as match circles follow their circular orbits 
showing the case of a type I critical rotation due to a three-way intersection between three 
match circles. The solid circles are the match circles, the dotted circles are their orbits as 
$\phi$ varies. This illustrates how the topology of intersections changes as $\phi$ varies. 

Figure 9: Three snapshots of translation space as match circles follow their circular orbits 
showing the case of a type I critical rotation due to tangency condition between two match 
circles. This illustrates how the topology of intersections changes as $\phi$ varies. 

When we include angle constraints, that is we require that $|\theta_d - T_\phi[\theta_m]| \le \delta$, we 
introduce new topological boundaries. Figure 6 shows two views of TPS including the 
helical tubes forming geometrically consistent transformation sets for nine different feature 
matches. The $\phi$'s marking the ends of a given tube are also rotations where the topology 
of intersections of match-discs changes. Intuitively, a match-disc is present in $\mathcal{T}$ only in a 
rotation range in which the angle constraints are satisfied. The topology of $\mathcal{T}$, and thus 
$\Psi(T_\phi)$, changes at the edge of the feasible rotation range for any feature match, where 
essentially a match-disc vanishes or appears. So for each feature match, the end points of 
the range of feasible rotations also mark topological boundaries. Call these $\phi$'s Type II critical 
rotations. An example of a type II critical rotation is shown in figure 12. 

Figure 10: Three snapshots of translation space as match circles follow their circular orbits 
showing the case of a type I critical rotation due to a coincidence condition between two 
match circles. 

Figure 11: A sequence of match circles in translation space showing a type I critical 
rotation. These are slices of the same space as shown in figure 6. Type I critical rotations 
are when, as match circles orbit as functions of $\phi$, their topology of intersection changes. 

Figure 12: A sequence of rotations of match circles in translation space showing a type 
II critical point. These are slices of the same space as shown in figure 6. The topology 
of intersection of match-circles changes when a match-circle appears or disappears at the 
extrema of feasible orientation difference between the matched features. 

2.3.4 Enumerating Match Sets 

The critical rotations partition rotation space forming equivalence classes $\{E^\phi_k\}$ associated 
with constant topology of translation space. Each class $E^\phi_k = [\phi_{b,k}, \phi_{e,k}]$ is a range of 
rotation angles where $\phi_{b,k}$ and $\phi_{e,k}$ are critical rotations. To enumerate the match sets we 
use the fact that each class $E^\phi_k$ is associated with the set of translational equivalence classes 
$\Psi(T_\phi) = \{E^t_l\}$ for any $T_\phi \in E^\phi_k$. Consider one such equivalence class $[\phi_b, \phi_e]$. Pick any $\phi_m$ 
where $\phi_b < \phi_m < \phi_e$. At this rotation there is a particular configuration of match-discs in 
translation space. We'll call the circle forming the boundary of a match-disc a match circle. 
As we stated above, the union of the match circles divides the $u$-$v$ plane into cells, each of 
which forms a translational equivalence class $E^t_l$. At a rotation $\phi_m$ as chosen above, two 
circles that intersect at all intersect at exactly two points. By computing all these intersection 
points we have a set of points such that every cell in the $u$-$v$ plane has at least two such 
points falling in its boundary. This gives us a point in each cell. Because $\phi_b < \phi_m < \phi_e$, 
at $\phi_m$ there are no 3-way match-circle intersections, circles just tangent to one another, or 
coincident circles, and thus each match circle intersection point falls in the interior of all 
the match-circles it intersects except for the two match-circles from which it was computed, 
where it lies on the boundaries. The top and bottom frames of figure 11 illustrate generic 
configurations, while the middle frame shows a critical rotation delineating two equivalence 
classes. 
We can enumerate all maximal geometrically consistent match-sets over all transformations, 
computing 

$$\bigcup_{T_\phi, T_t} \{\psi(T_t, T_\phi)\},$$

as follows. For each equivalence class $[\phi_{b,i}, \phi_{e,i}]$ choose $\phi_{m,i}$ so $\phi_{b,i} < \phi_{m,i} < \phi_{e,i}$, apply 
the rotation $\phi_{m,i}$, and compute all match-circle intersection points. For each intersection 
point, determine the match-discs it falls within, building up a match set, and collect all 
such sets for all intersection points and all rotational equivalence classes. Albeit inefficient, 
this procedure can be used to enumerate the set of all maximal match sets. 
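
The first step of this procedure, choosing one rotation strictly inside each interval between
adjacent critical rotations, might be sketched as follows. This is only an illustration: the
function name `interval_midpoints` is hypothetical, and taking the midpoint is one convenient
choice among many, since any interior rotation would serve.

```python
import math

def interval_midpoints(critical_rotations):
    """One sample rotation inside each interval between adjacent critical
    rotations, treating rotation space as the circle [0, 2*pi)."""
    two_pi = 2 * math.pi
    crit = sorted({c % two_pi for c in critical_rotations})
    if not crit:
        return [0.0]                  # no critical rotations: a single class
    mids = []
    for k, start in enumerate(crit):
        end = crit[(k + 1) % len(crit)]            # wrap around the circle
        width = (end - start) % two_pi or two_pi   # lone critical rotation: full circle
        mids.append((start + width / 2) % two_pi)
    return mids
```

Each returned rotation is then a candidate $\phi_{m,i}$ at which the match-circle intersection
points are computed.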

2.4 Summary of the Idea 

We have a set of model features and a set of image features, and the set of all possible 
pairs between them. We wish to find the subsets of pairs for which there exists some trans- 
formation on the model features simultaneously aligning them with their matched image 
feature, to within the uncertainty bounds on position and orientation. These we called 
geometrically consistent match sets. For any given rotation applied to the model features, 
and for each feature match, there is a disc in translation space of feasible translations. 
The intersections of these feasible match discs imply geometrically consistent match sets. 
We can partition the set of rotation angles $\phi$ into ranges in which the set of geometrically 
consistent match-sets does not change as $\phi$ varies within the range given by the partition. 
Thus there is a finite set of rotation angles $\phi$ which need be considered. For each such 
rotation, there is a finite set of geometrically consistent match-sets which we can compute 
from the individual feasible matches. 

3 The Computation 

The previous section outlined the basic idea of topological analysis of transform parameter 
space. This section describes the idea from a computational standpoint. We'll first 
describe the computational components needed, and then describe an algorithm for feature 
matching. 

3.1 The Basic Components 

3.1.1 Partitioning Rotation Space 

We now give the details of the computation of critical rotations associated with change 
in the topology of feasible translations in translation space. There are three configurations 
of match-discs we need to consider: when two match-circles are just tangent, when three 
match circles intersect at a point yet have no common intersection of their interiors, and 
when two match-circles exactly coincide. 

Figure 13: When three match circles intersect at a point, their centers fall on a particular 
circle of radius $\epsilon$, centered at some point $t_0$. 

Two match-circles intersect at only one point when their centers $t_1$ and $t_2$ are exactly 
a distance $2\epsilon$ apart. In this case the centers satisfy 

$$(t_1 - t_2)(t_1^* - t_2^*) = 4\epsilon^2$$

where $t^*$ denotes the complex conjugate of $t$. Thus we seek the roots of the equation 

$$t_1 t_1^* + t_2 t_2^* - (t_1^* t_2 + t_1 t_2^*) - 4\epsilon^2 = 0.$$

Recall that $t = p_d - e^{i\phi} p_m$. Thus the above equation can be viewed as a function of $e^{i\phi}$. 
Multiplying the above equation by $e^{i\phi}$ will not change its roots, and inspection shows that 
the resulting equation is quadratic in $e^{i\phi}$ and therefore there are no more than two values 
of $\phi$ which satisfy it. If no unit magnitude complex number is a root then the circles are 
never tangent at any rotation. Also, any $\phi$ will satisfy it when all the coefficients are zero. The 
appendix in section 9 gives the solution to the above equation for $\phi$. There are $mn$ circles so 
there are at most $2\binom{mn}{2}$ such critical rotations. 
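
The tangency condition can be sketched numerically. Writing $z = e^{i\phi}$, $M = p_{m_1} - p_{m_2}$ and
$D = p_{d_1} - p_{d_2}$, the condition $|D - zM|^2 = 4\epsilon^2$ multiplied by $z$ gives the quadratic
$-MD^* z^2 + (|D|^2 + |M|^2 - 4\epsilon^2) z - M^* D = 0$. The Python fragment below solves it and keeps
the unit-magnitude roots. It is an illustration under these assumptions, not the closed-form
solution of the appendix; the name `tangency_rotations` and the tolerance are hypothetical.

```python
import cmath
import math

def tangency_rotations(pm1, pd1, pm2, pd2, eps, tol=1e-9):
    """Rotations phi at which the match circles of (pm1, pd1) and (pm2, pd2) are tangent."""
    M = pm1 - pm2
    D = pd1 - pd2
    a = -M * D.conjugate()
    b = abs(D) ** 2 + abs(M) ** 2 - 4 * eps ** 2
    c = -M.conjugate() * D
    if abs(a) < tol:
        return []        # degenerate: the center separation does not depend on phi
    disc = cmath.sqrt(b * b - 4 * a * c)
    roots = [(-b + disc) / (2 * a), (-b - disc) / (2 * a)]
    # Only roots on the unit circle correspond to actual rotations.
    return sorted(cmath.phase(z) % (2 * math.pi)
                  for z in roots if abs(abs(z) - 1) < tol)
```

For example, with $M = D = 1$ and $\epsilon = 0.5$ the quadratic becomes $z^2 - z + 1 = 0$, whose roots
$e^{\pm i\pi/3}$ give the two tangency rotations.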

Given three match circles whose centers are described by the complex quantities $t_1(\phi)$, 
$t_2(\phi)$, and $t_3(\phi)$, we seek those $\phi$ for which they all intersect at a point. Note that this is 
equivalent to seeking those pairs $(\phi, t_0)$ for which $t_1$, $t_2$, and $t_3$ fall on a circle of radius $\epsilon$ 
centered at some point given by $t_0$. See figure 13. 

This case is described by the following system, which we must solve for $\phi$ and 
$t_0 = u_0 + i v_0$. 

$$(t_1(\phi) - t_0)(t_1^*(\phi) - t_0^*) = \epsilon^2$$
$$(t_2(\phi) - t_0)(t_2^*(\phi) - t_0^*) = \epsilon^2$$
$$(t_3(\phi) - t_0)(t_3^*(\phi) - t_0^*) = \epsilon^2$$

As is derived in the appendix in section 9, the values of $\phi$ which satisfy this system satisfy 
the equation 

$$(t_2 - t_1)(t_3 - t_1)(t_2^* - t_1^*)(t_3^* - t_1^*)(t_3^* - t_2^*)(t_3 - t_2) + \epsilon^2 \left[(t_2^* - t_1^*)(t_3 - t_2) - (t_2 - t_1)(t_3^* - t_2^*)\right]^2 = 0.$$

Multiplying the above equation by $e^{3i\phi}$ will not change its roots, and inspection shows the 
result is a 6th degree polynomial in $e^{i\phi}$. There are at most six distinct solutions to this, 
unless all the coefficients are zero, in which case any $\phi$ is a solution. Note that since the above 
equation is purely real, the coefficients of the powers of $e^{i\phi}$ and $e^{-i\phi}$, respectively, are complex 
conjugates; thus, the equation is a 3rd degree trigonometric polynomial in $\cos(\phi)$ and $\sin(\phi)$. 
There are $\binom{mn}{3}$ triples of circles, so there are at most $6\binom{mn}{3}$ such critical rotations. 
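
The left-hand side of this equation equals $a^2 b^2 c^2 - 16\epsilon^2 K^2$, where $a$, $b$, $c$ are the side
lengths and $K$ the area of the triangle of centers, so it vanishes when the circumradius is $\epsilon$.
The sketch below evaluates that residual and locates its sign changes by sampling and bisection.
This is a swapped-in numerical technique for illustration, not the polynomial solution the
paper refers to; the function names and the sampling density are assumptions, and roots without
a sign change, or closer together than one sample step, would be missed.

```python
import cmath
import math

def triple_point_residual(phi, matches, eps):
    """a^2 b^2 c^2 - 16 eps^2 K^2 for the triangle of three match-circle centers.

    matches is a list of three (p_m, p_d) pairs of complex positions.
    """
    z = cmath.exp(1j * phi)
    t1, t2, t3 = (p_d - z * p_m for p_m, p_d in matches)
    sides = abs(t2 - t1) ** 2 * abs(t3 - t2) ** 2 * abs(t1 - t3) ** 2
    twice_area = ((t2 - t1).conjugate() * (t3 - t2)).imag   # signed 2K
    return sides - 4 * eps ** 2 * twice_area ** 2

def three_way_rotations(matches, eps, samples=3600, iters=60):
    """Approximate rotations where the residual changes sign (bisection on a grid)."""
    two_pi = 2 * math.pi
    roots = []
    for k in range(samples):
        lo, hi = two_pi * k / samples, two_pi * (k + 1) / samples
        f_lo = triple_point_residual(lo, matches, eps)
        f_hi = triple_point_residual(hi, matches, eps)
        if f_lo * f_hi > 0:
            continue                      # no sign change in this sample interval
        for _ in range(iters):
            mid = 0.5 * (lo + hi)
            f_mid = triple_point_residual(mid, matches, eps)
            if f_lo * f_mid <= 0:
                hi = mid
            else:
                lo, f_lo = mid, f_mid
        roots.append(0.5 * (lo + hi))
    return roots
```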

The third case where two match-circles exactly coincide is easily computed by equating 
$t_1 = t_2$, that is $p_{d_1} - e^{i\phi} p_{m_1} = p_{d_2} - e^{i\phi} p_{m_2}$, and solving for $\phi$. There are at most $\binom{mn}{2}$ 
such critical rotations. Finally, when we include constraints on relative orientation between 
matched features, there is a critical rotation where the difference in orientation of the model 
and image features is exactly $\delta$; there are two such rotations for each feature match, thus 
exactly $2mn$ such critical rotations. 
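
The orientation-based critical rotations follow directly from the angle constraint. The short
sketch below assumes, as an illustration, that rotating a model feature by $\phi$ adds $\phi$ to its
orientation; the function name `orientation_critical_rotations` is hypothetical.

```python
import math

def orientation_critical_rotations(theta_m, theta_d, delta):
    """The two rotations at which |theta_d - (theta_m + phi)| equals delta exactly."""
    two_pi = 2 * math.pi
    center = theta_d - theta_m            # rotation aligning the orientations exactly
    return ((center - delta) % two_pi, (center + delta) % two_pi)
```

Between the two returned rotations the match-disc for that feature match is present in
translation space; outside that range it is absent.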

3.1.2 Constructing Match Sets 

The critical rotations partition rotation space, and any equivalence class corresponds 
to a particular intersection topology of match-discs in translation space. As before, let an 
equivalence class be given by $[\phi_b, \phi_e]$ where these end points are adjacent critical rotations, 
and choose $\phi_m$ so that $\phi_b < \phi_m < \phi_e$. By the construction of the critical rotations, all the 
intersections of match circles at $\phi_m$ will be simply the two intersection points of pairs of 
match-circles. Given the locations of the centers of two match circles, $t_1$ and $t_2$, the two 
points of their intersection are given by 

$$\frac{t_1 + t_2}{2} \pm i (t_1 - t_2) \beta$$

where 

$$\beta = \sqrt{\frac{\epsilon^2}{(t_1 - t_2)(t_1^* - t_2^*)} - \frac{1}{4}}.$$

Each such point falls within some number of match-discs, and on the boundary of exactly 
two, those two that generated the intersection point. Thus each circle intersection point 
borders four different regions in translation space. See figure 14. To construct the different 
match sets associated with a given intersection point, we search through all the match-discs 
the point falls within, and then add either, neither, or both of the generating match circles 
to this set. Note that in enumerating the match-sets there is some duplication when two 
or more intersection points border the same cell in translation space. 
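
A minimal sketch of this construction, under the assumption that both circles have the same
radius $\epsilon$ and that the intersection points are the midpoint of the centers offset by
$\pm i(t_1 - t_2)\beta$, could look as follows. The names `circle_intersections` and
`match_sets_at_point`, and the dictionary layout of `centers`, are hypothetical.

```python
def circle_intersections(t1, t2, eps):
    """Intersection points of two radius-eps circles centered at complex t1 and t2."""
    d2 = abs(t1 - t2) ** 2
    if d2 == 0 or d2 > 4 * eps ** 2:
        return []                              # coincident or disjoint circles
    beta = max(0.0, eps ** 2 / d2 - 0.25) ** 0.5
    mid = (t1 + t2) / 2
    offset = 1j * (t1 - t2) * beta             # perpendicular to the line of centers
    return [mid + offset, mid - offset]

def match_sets_at_point(point, generators, centers, eps):
    """The four match sets bordering an intersection point.

    centers maps each match (i, j) to its match-disc center; generators is the
    pair of matches whose circles produced the point.
    """
    a, b = generators
    inside = frozenset(k for k, c in centers.items()
                       if k not in generators and abs(point - c) < eps)
    # Add neither, either, or both of the generating matches.
    return [inside, inside | {a}, inside | {b}, inside | {a, b}]
```

Collecting the results over all intersection points at a sampled rotation yields the match sets
for that rotational equivalence class, with the duplication noted above.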

We can now show a loose upper bound on the complexity of feature matching in this 
case. For each of the $O(m^3 n^3)$ critical rotations there are at most $2m^2 n^2$ intersection 
points. For each circle intersection point we must potentially look for containment in at 
most $mn - 2$ match circles in order to construct the match sets associated with it. So a 
crude upper bound on the complexity of constructing all maximal geometrically consistent 
match sets is $O(m^3 n^3) \cdot 2m^2 n^2 \cdot (mn - 2) = O(m^6 n^6)$. This gives the basic idea; however, 
there are more efficient ways to construct match sets which we will use in the algorithm 
development below. 
Figure 14: Each intersection point between two match circles borders four different regions 
of translation space, and is thus associated with four different match sets. 

3.1.3 Evaluating Match Sets 

Ultimately each match set forms a hypothesis on object pose which needs to be evaluated 
and verified in some way. As an initial step to this process we must determine the order in 
which the hypotheses will be verified, in case we don't verify them all. We seek to assign 
some value to each hypothesis. The simplest and most intuitive value is the number of 
matches in a match set. This value in a sense accounts for how well the hypothesis explains 
the observed image data. Note that as yet we have not required a match-set to be one-to- 
one. It is possible that different feature matches involving the same model or image features 
will be in the same geometrically consistent match set. In the case of point features it is 
likely that we require that features be matched one-to-one. Thus a straightforward count 
of the cardinality of a match set would be misleading. 

One reasonable approach to this problem is to construct a one-to-one match set of 
maximal cardinality by eliminating some feature matches from the initial match sets. This 
is easy to do[9]. Construct one bipartite graph for each different match set. Let one set 
of nodes represent all the image features, and let the other set of nodes represent all the 
model features. There is an edge between two nodes if the given feature match is in the 
match set. The graph is bipartite because no edge connects two model feature nodes, or 
two image feature nodes. By finding a maximum bipartite matching we obtain a largest 
one-to-one subset of the matches in the match set. The bipartite matching problem takes 
$O(N^{2.5})$ time [10], where N is the number of nodes in the bipartite graph. Note that there may 
be many one-to-one match subsets of maximal cardinality, and we can only find one with 
the graph approach. However we only really seek a way to evaluate an hypothesis. Any 
match subset implies approximately the same range of transformations as its containing 
match set. The cardinality of a maximal one-to-one match set is the value assigned to each 
hypothesis. 
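
The bipartite construction above can be sketched as follows. This is a minimal illustration, not the method of [10]: it uses simple augmenting paths rather than an $O(N^{2.5})$ algorithm, and the function name `max_one_to_one` and the feature labels are hypothetical.

```python
def max_one_to_one(matches):
    """Return a largest one-to-one subset of (model, image) feature matches.

    Uses simple augmenting paths (an assumption made for brevity), not the
    O(N^2.5) matching algorithm cited in the text.
    """
    adj = {}
    for model, image in matches:
        adj.setdefault(model, []).append(image)
    owner = {}  # image feature -> model feature currently matched to it

    def augment(model, seen):
        for image in adj[model]:
            if image in seen:
                continue
            seen.add(image)
            # Take a free image feature, or re-route its current owner.
            if image not in owner or augment(owner[image], seen):
                owner[image] = model
                return True
        return False

    for model in adj:
        augment(model, set())
    return sorted((m, i) for i, m in owner.items())


# A match set that is not one-to-one: m1 and d1 each appear twice.
match_set = [("m1", "d1"), ("m1", "d2"), ("m2", "d1"), ("m3", "d2")]
one_to_one = max_one_to_one(match_set)
print(len(one_to_one))  # the value assigned to this hypothesis
```

Only the cardinality of the result is used to rank the hypothesis, so it does not matter which of several equally large one-to-one subsets is returned.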

3.1.4 Some Approximations 

Thus far there is considerable complexity in the method as developed. First, to find 
all critical rotations takes time $O(m^3 n^3)$ because we may have to consider all triples of 
match-discs. One reasonable approximation we can make is to only consider pairs of match 
discs instead. The idea is the following. We are looking for places where, say, N match 
circles have a non-empty intersection. By considering pairwise circle intersections, which 
we can find by computing type I critical rotations due to tangency between two circles, we 
can find a range of rotation where each of the N circles intersects the other N - 1 circles 
pairwise. Although in this range we cannot be sure that the intersection of all N match 
discs is non-empty for these rotations, we will be very close to the rotation where this is 
true. Intuitively this is because the errors, and thus the match discs' positions, are random 
variables and it is unlikely that the N discs will be randomly arranged so that they all 
intersect pairwise yet there is no region where all or most of them intersect. In section 4 
we show that the empirical data support this hypothesis. 

Second, when evaluating match-sets we need not perform a full bipartite matching. 
Instead we can form an upper bound on the size of the maximal match subset as follows. 
Determine the minimum of the number of distinct image features and the number of distinct 
model features appearing in a match set. This quantity, which discounts all duplicate 
features, is the upper bound we seek. It can be computed in time linear in the number of 
matches in the match set. See [9] for a detailed discussion of this idea. 
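
A minimal sketch of this bound, assuming a match set is represented as a list of (model, image) label pairs (a representation chosen here for illustration only):

```python
def one_to_one_upper_bound(matches):
    """Upper bound on the size of any one-to-one subset of `matches`."""
    models = {model for model, _ in matches}
    images = {image for _, image in matches}
    return min(len(models), len(images))


# Three distinct model features and three distinct image features give a
# bound of 3, although no one-to-one subset here has more than 2 matches.
loose = [("m1", "d1"), ("m2", "d1"), ("m3", "d2"), ("m3", "d3")]
print(one_to_one_upper_bound(loose))
```

The example shows that the bound can overestimate the true maximum, which is the price paid for avoiding the full bipartite matching.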

3.2 An Algorithm for Hypothesis Construction 

This section details an algorithm for constructing all maximal geometrically consistent 
match-sets. We use the approximation that only pairs of match-disks are considered in 
determining critical rotations. In the actual implementation we make the following 
observation: Rather than choose an intermediate angle $\phi_m$ inside an equivalence class at which 
to evaluate all intersecting match circles, we only look at regions of the u-v plane where 
change occurs. That is, we look at the regions containing the point of tangency of two 
match circles for Type I critical rotations, and we look at all circle intersections contained 
within the match circle associated with a Type II critical rotation. The algorithm follows 
these basic steps: 

• Form feature matches, and their corresponding match circles 

• Extremes of valid rotation for each match circle form critical rotations 

• Form pairs of match circles which intersect at some rotation 

• Points of tangency for these pairs form critical rotations 

• Compute and order these critical rotation angles 

• Compute the match sets geometrically consistent at some initial rotation angle, $\phi_0$ 

• Step from $\phi_0$ through critical rotation angles in order 

• At each new critical rotation, determine any change in the match-sets formed 

3.2.1 Intersecting Pairs of Match Circles 

There are exactly mn feature matches, each associated with a match circle in translation 
space whose center is given as a function of rotation angle $\phi$. If we consider the orbit of 
a match circle as a function of $\phi$ without considering constraints on valid rotations due 
to orientation we have a circular orbit. Only over the range of rotations of width $2\delta$ where 
the orientations of the model and image features are within $\delta$ of one another is the feature 
match feasible. See figure 7. Thus, most match circles will not intersect at any point in 
their orbit, and even fewer within their valid range of rotation angle $\phi$. 
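
To make the orbit picture concrete, the following sketch computes the rotations at which two match circles are tangent (the Type I events). It assumes that every match circle has radius $\epsilon$ and that centers follow $t(\phi) = p_d - e^{i\phi} p_m$ with points stored as complex numbers; the function name and argument order are hypothetical, and the orientation constraint is deliberately ignored.

```python
import cmath
import math


def tangency_rotations(pm_i, pd_i, pm_j, pd_j, eps):
    """Rotations phi at which two match circles of radius eps are tangent.

    Centers follow t(phi) = p_d - exp(i*phi) * p_m (points as complex numbers).
    Returns 0, 1 or 2 angles in [0, 2*pi).
    """
    D = pd_i - pd_j  # difference of the image points
    M = pm_i - pm_j  # difference of the model points
    if D == 0 or M == 0:
        return []    # center distance does not depend on phi
    # |D - e^{i phi} M|^2 = |D|^2 + |M|^2 - 2|D||M| cos(phi + theta) = (2 eps)^2
    theta = cmath.phase(D.conjugate() * M)
    c = (abs(D) ** 2 + abs(M) ** 2 - 4 * eps ** 2) / (2 * abs(D) * abs(M))
    if abs(c) > 1:
        return []    # always apart or always overlapping: no tangency
    a = math.acos(c)
    return sorted({(s * a - theta) % (2 * math.pi) for s in (1, -1)})


angles = tangency_rotations(1 + 0j, 1 + 0j, 0j, 0j, 0.5)
print(angles)
```

Between the two returned angles the circles either overlap or are disjoint, so these angles bound the rotation range over which the pair can contribute to a common match set.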

To form pairs of intersecting match circles, the simplest approach is to consider all 
pairs of match-circles and determine if intersection is possible. There are $\binom{mn}{2}$ such pairs, 
requiring $O(m^2 n^2)$ time. By being slightly more careful we can determine which circles 
could ever possibly intersect without necessarily considering all pairs of circles. We can 
bound the region of the u-v plane swept out by each match circle over its range of valid 
rotation by a rectangle, and determine which rectangles intersect. Only match circles whose 
bounding rectangles intersect can intersect one another. See figure 15. The intersections of 
N isothetic (see footnote 3) rectangles in the plane can be computed in $O(N \lg N) + K$ time [12], where K is 
the number of intersecting pairs of rectangles. From these potential pairs the valid intersecting pairs 



3 Isothetic in this case means the sides of the rectangles are all aligned parallel to the coordinate axes. 

Figure 15: The rectangles bounding the area swept out by match circles over their valid 
range of possible rotations. Inside each rectangle are the positions of the match circle at 
each of its extremes in valid rotation. The position at the lower rotation angle is shown 
with a dot in the center of the circle. 

can be determined. Of course, $K = O(N^2)$, but in practice there are many fewer rectangles 
which actually intersect, so this first step can reduce the computation considerably in 
practice. 
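
The bounding-rectangle filter can be sketched as below. This is an illustration under stated assumptions: match circles have radius $\epsilon$, centers follow $t(\phi) = p_d - e^{i\phi} p_m$, the valid range satisfies $\phi_b \le \phi_e$, and rectangles are compared pairwise rather than with the $O(N \lg N) + K$ method of [12]. The names `swept_box` and `boxes_overlap` are hypothetical.

```python
import cmath
import math


def swept_box(p_m, p_d, phi_b, phi_e, eps):
    """Axis-aligned box (xmin, ymin, xmax, ymax) covering a match circle of
    radius eps as phi runs over [phi_b, phi_e] (assumes phi_b <= phi_e)."""
    # The center moves on a circle of radius |p_m| around p_d.
    r, a0 = abs(p_m), cmath.phase(-p_m)
    lo, hi = a0 + phi_b, a0 + phi_e
    angles = [lo, hi]
    # Axis-aligned extremes of the arc occur at multiples of pi/2.
    k = math.ceil(lo / (math.pi / 2))
    while k * math.pi / 2 <= hi:
        angles.append(k * math.pi / 2)
        k += 1
    pts = [p_d + cmath.rect(r, a) for a in angles]
    xs = [p.real for p in pts]
    ys = [p.imag for p in pts]
    return (min(xs) - eps, min(ys) - eps, max(xs) + eps, max(ys) + eps)


def boxes_overlap(a, b):
    """True if two isothetic boxes share at least one point."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


box = swept_box(-1 + 0j, 0j, 0.0, math.pi / 2, 0.5)
print(box)
```

Only pairs whose boxes overlap need to be passed on to the exact tangency computation.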

3.2.2 Computing Maximal Geometrically Consistent Match Sets 

For all match circles and all intersecting pairs of match circles, the set of critical rotations 
is computed and sorted modulo $2\pi$. We pick an arbitrary rotation angle $\phi_0$ that is interior 
to some partition, that is, $\phi_0$ is not in the set of critical rotations. The idea will be to 
compute the geometrically consistent match sets at $\phi_0$, and then step through the set of 
critical rotations noting any changes that occur in the set of geometrically consistent match 
sets. Changes in match sets can only occur at critical rotation points. 
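
A small sketch of the ordering step, assuming each critical rotation is stored as an `(angle, kind)` tuple (a hypothetical representation): the events are sorted by the angular distance travelled from the starting angle, modulo $2\pi$.

```python
import math


def order_from(phi0, critical):
    """Order critical rotations as they are met when sweeping up from phi0."""
    two_pi = 2 * math.pi
    return sorted(critical, key=lambda ev: (ev[0] - phi0) % two_pi)


events = [(0.1, "I"), (3.0, "II"), (6.0, "I")]
ordered = order_from(1.0, events)
print(ordered)
```

Starting from 1.0 the sweep meets 3.0 and 6.0 first and wraps around to 0.1 last.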

To start we compute the locations of the match circles at the rotation $\phi_0$, and the points 
of intersections of match circles. As outlined in section 2.3.1, the union of the match circles 
divides the u-v plane into cells associated with match sets. Each distinct cell has at least two 
circle intersection points in its boundary. From these intersection points we can construct 
the match sets by counting the number of circles containing each intersection point. The 
appendix in section 9 describes an efficient algorithm for determining the set of match sets 
for this initial, static configuration of match circles. 

We construct and maintain a dynamic data structure containing, for each match-circle, 
the set of other match-circles currently intersecting it. This table is updated at each critical 
rotation when the match-sets change. For a given circle producing an intersection point, 
these are the only circles that need be checked for containment of the intersection point in 
order to determine the associated match-set. 
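
The containment check against the intersection table might look like the sketch below. It assumes circles of radius $\epsilon$ with centers stored as complex numbers in a dictionary, and it returns only the largest match set touching the point (the one that includes both generating circles); the names and the small tolerance are hypothetical.

```python
import math


def match_set_at(point, candidates, centers, eps, tol=1e-9):
    """Matches among `candidates` whose circle (radius eps) contains `point`."""
    return {k for k in candidates if abs(point - centers[k]) <= eps + tol}


centers = {"a": 0 + 0j, "b": 1 + 0j, "c": 0.5 + 0.5j, "d": 5 + 5j}
# Table of circles currently intersecting each circle (circle d is isolated).
intersecting = {"a": {"b", "c"}, "b": {"a", "c"}, "c": {"a", "b"}, "d": set()}

# Upper and lower intersection points of circles a and b (radius 1).
upper = complex(0.5, math.sqrt(0.75))
lower = complex(0.5, -math.sqrt(0.75))
candidates = intersecting["a"] | {"a"}  # circle d is never tested
print(match_set_at(upper, candidates, centers, 1.0))
print(match_set_at(lower, candidates, centers, 1.0))
```

Because circle d does not appear in the table entry for circle a, it is never tested, which is the saving the dynamic table is meant to provide.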

We next step through the equivalence classes of rotation space noting new geometrically 
consistent match sets as they appear. There are a few subtleties that must be considered. 
First consider two match-circles which intersect over some rotation range. Ignoring the 
angle constraints there are two type I critical rotations associated with these two match-circles. 
Each match circle also has a range $[\phi_b, \phi_e]$ over which it is valid. The 
intersection of these ranges, $[\phi_{b_i}, \phi_{e_i}] \cap [\phi_{b_j}, \phi_{e_j}]$, for two different match circles yields the 
range of rotations where these circles are both valid. Note that either of the two type 
I critical rotations may or may not fall inside the range $[\phi_{b_i}, \phi_{e_i}] \cap [\phi_{b_j}, \phi_{e_j}]$. If a type I 
rotation $\phi_c$ is not in $[\phi_{b_i}, \phi_{e_i}] \cap [\phi_{b_j}, \phi_{e_j}]$ then it is not a critical rotation we use to construct 
match sets; however, we do use it to maintain the dynamic list of currently intersecting 
circles. Therefore, at each type I critical rotation we add or delete circles from the dynamic 
intersection sets as needed, but we also know whether or not any given match circle is within 
its range $[\phi_b, \phi_e]$ for the particular $\phi$ being considered. A second subtlety is that when a 
type II critical rotation occurs, several new match-sets are formed. All circle intersections 
falling within the newly appearing match-circle must be considered to determine the new 
match-sets formed. Next we bring this all together in an algorithm. 

3.2.3 Algorithm Summary 

Considering all this we have the following procedure: Starting with the initial intersections 
noted at $\phi_0$, step through the sorted critical rotations, updating the dynamic circle 
intersection lists at each Type I critical rotation. If a Type I critical rotation associated 
with circle i and circle j occurs within $[\phi_{b_i}, \phi_{e_i}] \cap [\phi_{b_j}, \phi_{e_j}]$, then compute the new match set 
implied by determining which circles contain the single point of intersection of the circles i 
and j for the critical rotation. Only the circles intersecting either i or j need to be consid- 
ered. At each Type II critical rotation associated with a circle i, the pairwise intersection 
points of all circles intersecting circle i are computed, and for each a new match-set is 
constructed containing the match for circle i. 

Finally, for each of the match sets constructed we can evaluate them to provide the 
output of the hypothesis module. Each match set was associated with some critical rotation, 
and one of the circle intersection points bounding the cell in translation space associated 
with the match sets provides a feasible translation. In this way with each match set we can 
associate a transformation hypothesis. 

4 Experiments 

We have shown that all maximal geometrically consistent match-sets can be found by 
analysis of transform parameter space, facilitated by partitioning rotation space at critical 
rotations. We have outlined and implemented the full algorithm which constructs match 
sets, as well as an approximate algorithm. The full algorithm computes all critical rotations; 
the approximate algorithm does not consider the triple intersections described above but 
instead considers only Type I critical rotations due to a tangency condition between two 
match circles in translation space, and all Type II critical rotations. 

To explore the effectiveness of the approximate algorithm several empirical tests were 
performed. The question we need to answer is how often will a correct hypothesis be 
missed when the approximate algorithm is used. That is, given that there are k of the 
model features present in the image data, how often will one of the hypotheses include all 
k features, and what fraction of these k features will be found otherwise. 

4.1 Simulations 

In order to carefully control all aspects of the problem, simulation experiments were 
conducted in which synthetic model feature sets and data feature sets were constructed. 
Here we'll discuss how these data were constructed and the experiments run. 

4.1.1 The Simulated Model and Image Features 

Synthetic model and image features were constructed in the following way. In all the 
experiments point features were used consisting of a position in the plane and an associated 
orientation. For all experiments an "image" size of 256 by 256 was assumed; that is to say 
the coordinates of the position of a feature, (x, y), fell in the range $x, y \in [-128, 128]$. 
For any given experimental trial a total of m random points were generated with integer 
coordinates (x, y). To each of these points an independent unit vector of random orientation 
was assigned. This set of oriented feature points formed the model feature set. Examples 
of synthetic model and image features are shown in figure 2. 

To construct simulated image data, a copy of the set of m model features was con- 
structed. From this set, k of the m model features were chosen at random and the remaining 
features deleted from the set to simulate occlusions and missing features. For each remaining 
feature its position was perturbed by adding a random vector of length $l < 0.95\epsilon$, and 
its orientation was perturbed by a random angle chosen from $[-0.95\delta, 0.95\delta]$. Next, an 
arbitrary rotation and translation were applied to this perturbed set. Finally, to simulate 
spurious data features additional random features were added to complete the synthetic 
image feature set. 
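
The construction just described can be sketched as follows. Several details are assumptions made only for this illustration: perturbation lengths and angles are drawn uniformly, the translation is drawn from an arbitrary range, orientations are stored as angles in radians, and the name `make_trial` is hypothetical.

```python
import cmath
import math
import random


def make_trial(m, k, n_spurious, eps, delta, rng):
    """Build (model, image) lists of (position, orientation) features.

    Positions are complex numbers; orientations are angles in radians.
    """
    def rand_feature():
        pos = complex(rng.randint(-128, 128), rng.randint(-128, 128))
        return pos, rng.uniform(0, 2 * math.pi)

    model = [rand_feature() for _ in range(m)]
    kept = rng.sample(model, k)                    # simulate occlusion
    phi = rng.uniform(0, 2 * math.pi)              # arbitrary rotation
    t = complex(rng.uniform(-50, 50), rng.uniform(-50, 50))  # assumed range
    image = []
    for pos, theta in kept:
        # Position error of length at most 0.95 * eps in a random direction.
        shift = cmath.rect(rng.uniform(0, 0.95 * eps),
                           rng.uniform(0, 2 * math.pi))
        dtheta = rng.uniform(-0.95 * delta, 0.95 * delta)
        image.append((cmath.exp(1j * phi) * (pos + shift) + t,
                      theta + dtheta + phi))
    image += [rand_feature() for _ in range(n_spurious)]  # spurious features
    return model, image


rng = random.Random(0)
model, image = make_trial(15, 11, 15, 8.0, math.radians(12), rng)
print(len(model), len(image))
```

Passing an explicit `random.Random` instance keeps each trial reproducible while leaving trials independent of one another.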

4.1.2 The Experiments 

Because the image and model features were constructed synthetically, it was possible 
to determine which matches were correct. Each experimental run determined whether 
one of the hypothesized geometrically consistent match sets contained all k correct feature 
matches among their matches, where k is the number of model features actually appearing 
among the image features. To demonstrate the effectiveness of the complete version of the 
algorithm in which the triple-circle intersection case was considered, the following experiment 
was run. First, m = 15 random model features were generated, from which k = 11 were 
randomly selected, independently perturbed in position and orientation, and added to 15 
other random features. This set of n = 26 data features was then rotated and translated 
by an arbitrary amount. This formed a model set of 15 features, and a data set of 26 
features, of which 11 corresponded to the model. This process was repeated in 1000 independent 
trials using $\epsilon = 8$ and $\delta = 12$ degrees, generating completely new random model 
and data features each trial. When all critical rotations were computed, including triple 
intersections, a correct match-set of size 11 was found in all 1000 trials. 

Because the existence of spurious features only adds to the number and size of match 
sets, but does not affect what fraction of the k model features appear in an hypothesized 
match set, for the following experiments there were no deleted or spurious features. The fol- 
lowing experiments test the degree to which the approximate algorithm actually constructs 
all maximal geometrically consistent match sets. 

4.1.3 The Effect of Model Feature Count 

The first experiment explored the effect of increasing the number, m, of model features. 
For these experiments there were n = m image features. For each value of 
$m \in \{3, 6, 9, 12, 15, 18, 21, 24\}$, 10,000 independent, completely random experiments were performed. 
That is to say, for each value of m, 10,000 different random model feature sets 
were generated, and from each of these the random image features were constructed by 
applying an independent random perturbation to each model feature, and then applying a 
random rigid rotation and translation to this set, independent from all other trials. The 
results of this experiment are summarized in two figures. Figure 16 shows how often the 
approximate algorithm failed to find a match-set with m correct feature matches in it, i.e. 
a correct hypothesis. Shown, for each value of m, is the fraction of 10,000 trials where a 
match-set containing m correct matches was not constructed. 

Importantly, out of these 80,000 independent random experiments, a correct hypothesis 
was constructed in all but 7287 trials. Of these, in 6970 trials the largest correct match-set 
was of size m - 1, in 313 trials the largest correct match-set was of size m - 2, and in 
only 4 trials was the largest correct match-set of size m - 3. This means that using the 
approximate algorithm, in 99.6% of the 80,000 trials a correct match-set of size m or m - 1 
was constructed. Figure 17 shows the average over 10,000 trials of the fraction of the model 
correctly matched in the largest correct match-set, for each value of the model size m. 
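
The 99.6% figure follows directly from the counts quoted above; the arithmetic is spelled out here as a check.

```latex
\begin{align*}
\text{trials with a match-set of size } m &: \; 80{,}000 - 7287 = 72{,}713,\\
\text{trials with a match-set of size } m \text{ or } m-1 &: \; 72{,}713 + 6970 = 79{,}683,\\
\frac{79{,}683}{80{,}000} &\approx 0.996, \qquad 6970 + 313 + 4 = 7287.
\end{align*}
```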

4.1.4 The Effect of Position Error and Uncertainty 

The second experiment explored the effect of varying the amount of error in position 
which was introduced when constructing synthetic image features sets, and correspondingly 
varying the uncertainty bound assumed in the algorithm. For this experiment a range of 
integer position uncertainty bounds $\epsilon \in \{2, 4, 6, 8, 10, 12, 14, 16, 18, 20\}$ was used, and the 
error introduced in the synthetic image features was $0.95\epsilon$. For each value of $\epsilon$, 1000 independent 
trials were conducted, using 9 model features and 9 synthetic image features. 
Figure 18 shows how often the approximate algorithm failed to find a match-set with 9 
correct feature matches in it, i.e. a correct hypothesis. Shown, for each value of $\epsilon$, is the 
fraction of 1000 trials where a match-set containing 9 correct matches was not constructed. 
Over all 10,000 trials (1000 for each of the 10 values of $\epsilon$), there were 851 trials where a 
correct match-set of size 9 was not constructed, of which 828 constructed match-sets of 
size 8, and 23 constructed match-sets of size 7. 

Figure 16: For each value of m, the number of model features, shown is the fraction of 10,000 
trials where a match-set containing m correct matches was not found, although one or more existed. 

4.1.5 The Effect of Orientation Error and Uncertainty 

The third experiment explored the effect of varying the amount of error in orientation 
which was introduced when constructing synthetic image features sets, and correspondingly 
varying the uncertainty bound assumed in the algorithm. For this experiment a range of 
integer orientation uncertainty bounds $\delta \in \{2, 4, 6, 8, 10, 12, 14, 16, 18, 20\}$ (in degrees) was 
used, and the error introduced in the synthetic image features was $0.95\delta$. For each value 
of $\delta$, 1000 independent trials were conducted, using 9 model features and 9 synthetic image 
features. Figure 19 shows how often the approximate algorithm failed to find a match-set 
with 9 correct feature matches in it, i.e. a correct hypothesis. Shown, for each value of 
$\delta$, is the fraction of 1000 trials where a match-set containing 9 correct matches was not 
constructed. Over all 10,000 trials (1000 for each of the 10 values of $\delta$), there were 735 
trials where a correct match-set of size 9 was not constructed, of which 720 constructed 
match-sets of size 8, and 15 constructed match-sets of size 7. 

4.2 A Real Example 

Finally, as an example of a real application, and of running the algorithm on a much 
more complicated case, features were derived from real images. For both the model features 
and the image features, the features were derived by starting with a grey-level image of the 
Figure 17: For each value of m, the number of model features, shown is the average over 10,000 
trials, of the fraction of the model features correctly matched in the match set with the largest 
number of correct matches. 

object and a scene, applying an edge detector, and sampling the resulting edge contours 
at regular intervals, forming features out of the sample point positions and the contour 
normal at that point. Of course, there are possibly much better ways to derive point 
features from edge contours, but this method demonstrates our point very well. Figure 
20 shows the image from which the image data were derived. Figure 22 shows the edge 
contours for both the model and the input data image, and figure 21 shows the model and 
image point features derived from the edge contours. Figure 20 shows an example of one of 
the hypotheses constructed. The model contour is transformed according to the hypothesis 
and plotted over the image contours. In this case there were 41 model features and 161 
image features, and the largest match-set had 27 matches. This is the match set we have 
displayed. This illustrates that we can use the method to derive robust pose hypotheses 
from complex images. 

4.3 Discussion of the Experiments 

The most important conclusion we can draw from these experiments is that the approx- 
imate algorithm, which only considers the interaction of pairs of match-circles, works nearly 
as well as the complete algorithm. Over a broad range of model size m, and the uncertainty 
parameters $\epsilon$ and $\delta$, all image features were correctly matched to model features, except for 
one or two. When the model is reasonably large, say 10 features or more, almost all the 
model will be correctly matched. 

The experimental results can be explained in terms of the interactions of match-circles. 

Figure 18: For the case of 9 model features and 9 image features, for each value of the position 
error introduced and the uncertainty assumed, we plot the fraction of 1,000 trials where a match-set 
containing 9 correct matches was not found, although one or more existed. 

In the experimental results shown in figure 16, the fraction of cases with suboptimal per- 
formance rises with the model size m. This makes sense in retrospect. Imagine the u-v 
plane when the correct rotation has been applied. There is a region of intersection of all 
m correct match discs. As we vary $\phi$ and some disc moves off of this region of intersection, 
the more match-circles involved, the more likely it is to cross over the intersection of two 
other circles, rather than crossing only a single circle. In the case shown in figure 18, the 
larger the match-circles, the less likely many of them will actually bound the region of 
maximal intersection, rather than just containing it. Thus as some circle moves off there 
are fewer circle intersections for it to cross. Finally in figure 19 we see that the type II 
critical rotations reduce the need to look for type I critical rotations. The larger the angle 
uncertainty $\delta$, the more circles interact with type I intersections, and thus the more likely 
we will need to consider a three-way intersection. 

5 Extensions: Line Segments as Features 

Many existing recognition approaches use straight line approximations to contours. One 
advantage of this approach is a great reduction in the number of features to process. The 
approach taken in this paper is not limited to point features, but applies equally well to 
features with finite extent such as line segments. First consider the case where the model 
is composed of straight line segments and the image data are oriented point features as 
before. This is closely related to the case where both the model and data are composed of 
Figure 19: For the case of 9 model features and 9 image features, for each value of the orientation 
error introduced and the uncertainty assumed, we plot the fraction of 1,000 trials where a match-set 
containing 9 correct matches was not found, although one or more existed. 

line segments if we subtract the length of the image segment from the model segment and 
treat the image segment as a point. 

As before we will assume the uncertainty in position and orientation are independent. 
As in the point feature case, for a given feature match only those rotations leaving the 
orientation of the two features within $\delta$ of one another are feasible, where the orientation 
of a line segment is given by a unit perpendicular vector. In the case of position, consider 
the vector from the center of the line segment to the image point. See figure 23. For 
convenience we will define the range of feasible translations to be any translation leaving 
the perpendicular component (relative to the segment) of this vector less than $\epsilon$, and the 
tangential component less than $l/2 + \epsilon$, where $l$ is the length of the model line segment. Thus 
the range of feasible translations, for any particular rotation, is a rectangular box. 
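
The position test for a segment-to-point match can be sketched as below. It checks only the rectangular position constraint (the orientation constraint is separate), stores points as complex numbers, and describes the model segment by its midpoint, direction angle and length; the function name and parameter names are hypothetical.

```python
import cmath


def segment_match_feasible(p_m, seg_angle, length, p_d, phi, t, eps):
    """True if transforming the model segment by (phi, t) leaves image point
    p_d inside the match rectangle: |perpendicular| < eps and
    |tangential| < length / 2 + eps."""
    center = cmath.exp(1j * phi) * p_m + t       # transformed segment midpoint
    tangent = cmath.exp(1j * (seg_angle + phi))  # unit vector along the segment
    v = p_d - center                             # midpoint -> image point
    local = v * tangent.conjugate()              # v expressed in the segment frame
    return abs(local.imag) < eps and abs(local.real) < length / 2 + eps


# Segment of length 10 centered at the origin along the x axis, eps = 1.
print(segment_match_feasible(0j, 0.0, 10.0, 5.5 + 0.5j, 0.0, 0j, 1.0))
print(segment_match_feasible(0j, 0.0, 10.0, 6.5 + 0.0j, 0.0, 0j, 1.0))
```

The first point lies within the extended rectangle and the second lies beyond its end, so the calls return `True` and `False` respectively.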

Let $p_m$ represent the midpoint of a model line segment. The center of a match-rectangle 
in translation space is given as before by 

$$u + iv = t = p_d - e^{i\phi} p_m$$

where $p_d$ is the position of the image point. The match rectangle can be constructed by 
centering a rectangle around t whose sides are parallel and perpendicular to the orientation 
of the rotated model segment, $e^{i\phi}\theta_m$. 

As we vary the rotation applied to a model segment, the rectangle of feasible translations 
rotates around the point $p_d$ exactly as in the case described earlier. If we consider the 
translation space rectangles for all feature matches at any fixed rotation $\phi$, the intersection 
topology of the rectangles defines geometrically consistent match sets as before. Again the 
Figure 20: A real example. On the left is displayed the leading hypothesis. The model contour 
has been transformed and plotted over the image contours, indicating the pose hypothesized. On 
the right is the input image containing the object of interest. 

idea is to partition the space of rotations into ranges within which the intersection topology 
of translation space is unchanged. As in the case with match circles, the topology changes 
whenever two rectangles begin to intersect, and when three rectangles intersect at a single 
point, but have no common intersection. See figure 24. 

Two rectangles begin to intersect when one of the vertices of one lies on a line segment 
of the other. We'll parameterize a line by a perpendicular unit vector n, and perpendicular 
distance $\rho$ from the origin to the line. The orientation n is given by $n = -\sin\theta + i\cos\theta$, 
where $\theta$ is the positive angle from the x axis to the line. The equation for a line is then 
$x \cdot n = \rho$ in vector notation, or $x n^* + x^* n = 2\rho$ in our complex number representation. 
The equation $(x - x_0) n_0^* + (x^* - x_0^*) n_0 = 2\rho_0$ represents a line of the specified orientation 
whose perpendicular distance to the point $x_0$ is $\rho_0$. The equation 
$(x - x_0) e^{-i\phi} n_0^* + (x^* - x_0^*) e^{i\phi} n_0 = 2\rho_0$ represents the same line rotated about $x_0$ by an angle $\phi$. In this way 
we can represent the position of a match-rectangle in translation space by parameterizing its 
component line segments. 

Two rectangles intersect in the manner of interest when a vertex of one rectangle lies 
in a line segment of another. Let the position of a vertex in translation space be given by 

$$(v - p_{d_j}) e^{i\phi} + p_{d_j},$$

where $p_{d_j}$ is the center of rotation of the match-rectangle in translation space, and v is the 
position in the u-v plane of a vertex when $\phi = 0$. Let the infinite line containing a line 
segment of a match-rectangle be given as above by 

$$(x - p_{d_i}) e^{-i\phi} n_i^* + (x^* - p_{d_i}^*) e^{i\phi} n_i = 2\rho_i.$$
Figure 21: Model features on the left, image features on the right. These features were derived 
from real images as shown in figure 20. 

The point lies on the line when it satisfies the equation for the line, so we substitute the first 
equation for x in the second equation. Let $v_j = v - p_{d_j}$. Substituting for x we get 

$$(v_j e^{i\phi} + p_{d_j} - p_{d_i}) e^{-i\phi} n_i^* + (v_j^* e^{-i\phi} + p_{d_j}^* - p_{d_i}^*) e^{i\phi} n_i = 2\rho_i.$$

Note that all terms occur as complex conjugate pairs, so this equation is a real equation of 
the form $\alpha \cos\phi - \beta \sin\phi = \gamma$. This can be solved by the equations 

$$\phi = \pm \cos^{-1}\left(\frac{\gamma}{\sqrt{\alpha^2 + \beta^2}}\right) + \psi,$$

where $\psi = \tan^{-1}(-\beta/\alpha)$, taken in the quadrant of the point $(\alpha, -\beta)$. Once we have 
solved for the $\phi$ where the vertex lies in the infinite 
line containing the line segment, we must check if it actually falls in the line segment. For 
a pair of match-rectangles, this check is done for all pairs of vertices from one rectangle 
to line segments of the other. We must also check that at the resulting $\phi$ the interiors of 
the two rectangles have no common intersection, otherwise this is not a critical rotation in 
rotation space for these two rectangles. 
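
A sketch of the solution step for the real equation obtained from the vertex-on-line condition, with coefficients written as $\alpha$, $\beta$ and right-hand side $\gamma$ (the symbol names are chosen here for illustration). It relies on the identity $\alpha\cos\phi - \beta\sin\phi = r\cos(\phi - \psi)$ with $r = \sqrt{\alpha^2 + \beta^2}$, and uses `atan2` so that $\psi$ lands in the correct quadrant; the function name is hypothetical.

```python
import math


def solve_rotation(alpha, beta, gamma):
    """Solve alpha*cos(phi) - beta*sin(phi) = gamma for phi in [0, 2*pi)."""
    r = math.hypot(alpha, beta)
    if r == 0 or abs(gamma) > r:
        return []  # no rotation brings the vertex onto the line
    # alpha*cos(phi) - beta*sin(phi) = r*cos(phi - psi)
    psi = math.atan2(-beta, alpha)
    a = math.acos(gamma / r)
    return sorted({(psi + s * a) % (2 * math.pi) for s in (1, -1)})


roots = solve_rotation(1.0, 1.0, 0.0)
print(roots)
```

Each returned angle is only a candidate: as the text notes, the vertex must still be checked against the finite segment and the rectangle interiors must not overlap.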

In the case of a three-way rectangle intersection we solve for the intersection at a common 
point for three lines. Let the three equations be, for i = 1, 2, 3, 

$$(x - p_{d_i}) e^{-i\phi} n_i^* + (x^* - p_{d_i}^*) e^{i\phi} n_i = 2\rho_i.$$

We're solving for some x satisfying these equations for all 3 lines. Let $y = x e^{-i\phi}$, giving 

$$y n_i^* - p_{d_i} e^{-i\phi} n_i^* + y^* n_i - p_{d_i}^* e^{i\phi} n_i = 2\rho_i.$$

The above system is three equations in three unknowns, the two components of x and $\phi$. 
Using Gaussian elimination, eliminate y and $y^*$ from the equations. Solve the resulting 
quadratic for $e^{i\phi}$, substitute back into the original 
equations and solve for x. For each triple of match rectangles, all triples of line segments, 
one from each rectangle, must be considered. Finally, each intersection point solved for 
must lie inside each line segment, not just on its infinite extension; also at the resulting $\phi$, 
the interiors of the rectangles must have no common intersection. 
Figure 22: The edge contours derived from the model image and the input data image 

These two cases serve to find all critical rotations in rotation space where topological 
changes occur in translation space. For each such critical rotation point, the intersection 
topology of rectangles in translation space must be examined to find the individual cells, 
each associated with a different match set. These are found by finding the intersection 
points of all the line segments making up the rectangles. The rest of the procedure is 
analogous to the case of point features and match circles. 

It would be interesting to extend this idea to uncertainty cross sections of arbitrary 
convex polygons or curves. It may also be possible to have a multi-tiered uncertainty 
region, that is, cross sections that consist of several "concentric" convex closed curves, 
possibly weighted according to their likelihood. This would be a discrete approximation to 
a continuous uncertainty probability function. 

6 Related Work 

There has been a great deal of work on object identification and localization, particularly 
in the domain of 2D objects and sensory data. Some examples include [11][4][2][8][6]. The 
work most relevant to this paper is that which relies on determining the correspondences 
between local geometrical features derived from both the model and sensory data. Among 
these we consider here those approaches which explicitly account for error in the sensory 
data. 

For the purposes of comparison let's state the important points about our approach. 
We utilize local geometric features characterized by a position and an orientation. We 
assume that there is uncertainty in the measured geometry of a feature derived from the 
sensory data. We further assume this uncertainty can be bounded. By analysis of the 
space of transformation parameters, we can construct all maximal geometrically consistent 
Figure 23: The position uncertainty for a line segment feature extends a distance $\epsilon$ in the direction 
perpendicular to the measured segment, and a distance $l/2 + \epsilon$ from the measured center point in the 
direction parallel to the line segment. Above is shown the uncertainty in position space, below is 
shown the region of feasible translations in translation space. 

match-sets in polynomial time (see footnote 4). Each match-set forms a globally consistent interpretation 
of the sensory data in terms of the model features. Note that without any approximation, 
these geometrically consistent match-sets can be found in time polynomial in the number 
of model and image features. The asymptotic runtime of the algorithm is independent of 
the geometry of any particular set of model or image features. 

Grimson and Lozano-Perez[8] have developed a recognition system for both 2D and 
3D objects they call RAF. The features they use are oriented line segments in the 2D 
case. They assumed independent bounds on positional and orientational uncertainty. In 
RAF the space of feasible sets of model and image feature pairs is explored sequentially, 
formulated as a tree search. Each path through the tree from root to leaf forms an element 
of the power set $2^{\{m_i\} \times \{d_j\}}$ of feature pairs, where each node in the tree corresponds to 
a particular feature pair. Large sections of the tree can be pruned away by considering, 
for each pair of feature matches, the intersection of their sets of geometrically consistent 
transformations. If any pair of matches is not a geometrically consistent match set, then 
the entire path rooted there can be ignored. 
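
The pruned tree search described above can be illustrated with a small sketch. This is not the RAF implementation: it assumes unoriented point features and a single pairwise test (a distance constraint with tolerance 2*eps), and the names `pairwise_consistent` and `tree_search` are hypothetical. It shows only the search structure, in which each tree level decides the match for one data feature and a branch is abandoned as soon as any pair of matches fails the pairwise test.

```python
from math import dist


def pairwise_consistent(pair_a, pair_b, model, data, eps):
    """Assumed pairwise test: two matches may coexist only if the separation
    of the two model points and of the two data points agree within 2*eps."""
    (ma, da), (mb, db) = pair_a, pair_b
    if ma == mb or da == db:
        return False  # a feature may be used in at most one match
    return abs(dist(model[ma], model[mb]) - dist(data[da], data[db])) <= 2 * eps


def tree_search(model, data, eps):
    """Depth-first search of the interpretation tree.

    Level k of the tree decides the match for data[k]: either one model
    index or no match at all (a spurious feature). Returns every leaf,
    i.e. every pairwise-consistent list of (model_index, data_index) pairs.
    """
    results = []

    def extend(k, matches):
        if k == len(data):
            results.append(list(matches))
            return
        extend(k + 1, matches)  # branch: data[k] left unmatched
        for m in range(len(model)):
            candidate = (m, k)
            # Prune: skip the whole subtree if any earlier match conflicts.
            if all(pairwise_consistent(candidate, prev, model, data, eps)
                   for prev in matches):
                matches.append(candidate)
                extend(k + 1, matches)
                matches.pop()

    extend(0, [])
    return results
```

Even with pruning the number of leaves can grow exponentially with the number of data features, and a leaf is only pairwise consistent: nothing in this search establishes that a single transformation aligns all of its matches.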

Empirically, the algorithm is quite robust, accurately finding objects in cluttered scenes 
given inaccurate sensory data. However because the algorithm is inherently exponential in 
the number of image features, several heuristics are used to make it tractable in practice. 
There are two main heuristics employed. One involves grouping feature matches into subsets 
via a Hough Transform form of parameter hashing and exploring the tree restricted to 
some of these subsets. The other involves an early cutoff of the tree search when a set of 
pairwise-consistent matches that is deemed good enough is encountered. The use of these 
two heuristics means that the entire space of match-sets is not explored. Finally, when a 
pairwise geometrically consistent match-set is constructed, an averaging method is used to 
derive a transformation from the set of feature matches. This is then applied to determine if 
in fact most feature matches are approximately aligned. Note that this does not determine 
if there exists a transformation simultaneously aligning all matches modulo uncertainty, i.e. 
that it is a geometrically consistent match-set in our terminology. One difficulty with this 
search technique is that it is difficult to determine whether or not there are more instances 
of the object to be found after the first few are found. In fact, if there are no instances it 
takes considerable search to answer negatively. 

⁴ This assumes that solving the sixth-degree polynomials is a constant-time operation. However, 
as noted earlier, the approximate algorithm is very good, and the required quantities can be computed 
analytically. 

Figure 24: The events associated with topological change in translation space are two-way and 
three-way intersections of rectangles as shown here. 

The principal advantages of our work over RAF are that it is worst case polynomial in 
the number of features, that it finds all maximal geometrically consistent match sets, and 
that these are by construction globally consistent. In the full case there is no approximation 
involved: all feasible matches within the uncertainty bounds are found. In particular, the 
hypothesis step hypothesizes all possible instances whether or not the object is present. In 
fact, if no object is present the computation is easier because in this case fewer match-circles 
will have common intersections. Note that one major difference in the two approaches is that 
RAF uses the more practical line segment features, while our system uses point features. 
In section 5 we outline how the approach can be generalized to line segment features, thus 
making the problems solved by the two systems largely equivalent. 

Ellis[7] considers a special case of the method we discuss here. He assumes a model 
composed of line-segment features and data consisting of oriented point features. He assumes 
an uncertainty in position of magnitude $\epsilon$, and an independent uncertainty in orientation 
of magnitude $\delta$ for the data features. Given a set of corresponding model and image 
feature pairs and uncertainty bounds, he shows how to find the range of rotations on the data 
features which leaves them within $\epsilon$ of their paired model feature (or really within $\epsilon$ of the 
infinite line containing the line segment) and the difference in orientation of paired features 
within $\delta$ of one another. Then, given some rotation of the data features, he shows how 
to determine the range of translations on the data features allowed within the uncertainty 
in position. 

The important part of this work is that he defines a model of uncertainty, and shows how 
to localize an object with careful attention to the uncertainty. Ellis is considering a different 
problem where the correspondences are already known, which in a sense is a special case of 
the general matching problem. Using our approach to the analysis of transformation space 
we could accomplish the same task of localizing the object to within uncertainty bounds 
given the true correspondences. We would simply find the range of rotation $\phi$ within which 
all correct match-circles intersect in a common region. 

In the field of computational geometry, Alt et al.[1] describe a method for computing 
approximate congruence of two point sets of matched cardinality, allowing rigid transformations 
of the plane. This means given two point sets, A and B, of equal cardinality, 
find if there exists a one-to-one correspondence between them such that there exists a rigid 
transform of the points of set B bringing them within the $\epsilon$-neighborhood of their matched 
point from set A. They solve a very similar problem to ours. We might consider A to be 
image features, and B to be model features. They don't consider the case where A contains 
only a subset of B, as well as additional, unrelated points. They also determine only if at 
least one approximate congruence exists, instead of finding all approximate congruences of 
all sizes, as we do. They also do not consider angle uncertainty constraints as we do. It 
seems, however, that the modifications to their algorithm required to handle these cases 
are small. Thus the algorithm they outline could very likely be used to solve our matching 
problem in the case where angle constraints are not used. Their approach differs from 
ours in that they analyze the image space, instead of the transformation space. There are 
two possible advantages of our approach over theirs. First, it is clear how to extend our 
approach to extended features and polygonal uncertainty regions. Second, our method can 
be extended to higher dimensional transformations, such as including scale, while it is not 
clear how to extend their approach in these cases. 

Baird[3] describes a method of matching features under uncertainty based upon linear 
programming. He assumes bounds on position uncertainty of image features, and outlines 
an expected polynomial time algorithm for matching. An important limitation of this 
work is that the size of the model and image feature sets is the same, that is, he does not 
allow missing or spurious features. He indicates that considering these cases substantially 
increases the complexity of the matching process. 

The technique of transformation sampling[5] was the developmental ancestor to the 
analytic approach developed here. Rather than determine analytically all different sets of 
transformations, the space of transformations was sampled in hopes that a sample point 
would fall in each distinct feasible region of transformation space. The approach described 
here is also introduced briefly in [5]. 

The Hough transform clustering techniques[13] are really a crude approximation to our 
analytic approach. Again the idea is to determine where sets of feasible transformations 
intersect in transform space. But a coarse quantization is utilized to find such intersections, 
leading to considerable inaccuracy. 
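
To make the contrast with quantized clustering concrete, here is a minimal sketch of Hough-style voting. It assumes oriented point features stored as (complex position, orientation in radians), so that a single match fixes a rotation and a translation; the function name `hough_votes` and the bin sizes are hypothetical and are not taken from [13]. Each match casts one vote into a coarse bin of transformation space, and the most populated bin is read off as a candidate match-set.

```python
import cmath
from collections import defaultdict
from math import floor, pi


def hough_votes(model, data, trans_bin=1.0, rot_bin=pi / 18):
    """Vote every (model, data) match into a quantized (rotation, tx, ty) bin.

    model, data: lists of (position as complex, orientation in radians).
    Returns a dict mapping bin index -> list of (model_index, data_index).
    """
    bins = defaultdict(list)
    for i, (pm, am) in enumerate(model):
        for j, (pd, ad) in enumerate(data):
            phi = (ad - am) % (2 * pi)        # rotation aligning orientations
            t = pd - cmath.exp(1j * phi) * pm  # translation aligning positions
            key = (floor(phi / rot_bin),
                   floor(t.real / trans_bin),
                   floor(t.imag / trans_bin))
            bins[key].append((i, j))
    return bins
```

Two matches whose feasible transformation sets genuinely overlap can still fall on opposite sides of a bin boundary, and matches with no common feasible transformation can share a bin. Both effects are the inaccuracy referred to above.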

7 Conclusions 

In summary we have shown a feature matching and pose hypothesis technique that 
requires time polynomial in the number of features, and is provably correct and complete 
for a certain class of local geometric features and precise models of geometric uncertainty. 

We defined a maximal geometrically consistent match set as a set of model and image 
feature pairs for which there is some transformation which aligns the features in each match 
to within the bounds on geometric uncertainty, and such that at that transformation it is 
not possible to add more feature matches to the set and maintain feasibility. We showed that 
it is possible to compute all such geometrically consistent match-sets in time polynomial in 
the number of features. 

One important implication is that a complete and precise solution to the problem of 
feature matching in the presence of uncertainty is of polynomial complexity. If the feature 
matching were exact, i.e. there were no error, then there are simple polynomial algorithms 
to construct geometrically consistent match sets. However if only approximate matching 
is required then these simple algorithms cannot be proven to find all possible match-sets. 
Existing approaches which explicitly deal with geometric uncertainty as carefully as we do 
have been of worst case exponential complexity. Thus our analysis provides an important 
theoretical understanding of the matching problem. 

This approach to matching is intended as a crucial part of the hypothesis stage of a 
recognition system. We have not addressed the fact that given a geometrically consistent 
match set, the actual transformation implied can be refined by optimizing some difference 
metric over the set of matches. Once high quality hypotheses are generated, the final step 
is to verify them using possibly richer representations. 

While the 2D matching case described here is interesting, of greater interest is gaining 
a more thorough understanding of the matching problem with geometric uncertainty in 
higher dimensional problems such as matching 3D models to 2D image data, or 3D models 
to 3D range data. Our approach has important implications in these cases. Our analysis of 
transform space events goes beyond this simple case and can also be applied to these higher 
dimensional problems. An approach based on the ideas in this paper has been developed 
for the 2D case of rotation and translation including scale variation, and work is in progress 
considering matching 3D models to 2D and 3D data. 

8 Acknowledgments 

Thanks to Tao Alter, Eric Grimson, Berthold Horn, David Jacobs, and Tomas Lozano- 
Perez for reading earlier drafts and helpful comments and discussions. 

9 Appendix 

9.1 Solving three-way intersections 

Given three match circles whose centers are described by the complex quantities $t_1(\phi)$, 
$t_2(\phi)$, and $t_3(\phi)$, we seek those $\phi$ for which they all intersect at a point. Note that this is 
equivalent to seeking those pairs $(\phi, t_0)$ for which $t_1$, $t_2$, and $t_3$ fall on a circle of radius $\epsilon$ 
centered at some point given by $t_0$. See figure 13. This case is described by the following 
system, which we must solve for $\phi$ and $t_0 = t_{x0} + i\,t_{y0}$: 

$$
\begin{aligned}
(t_1(\phi) - t_0)(t_1^*(\phi) - t_0^*) &= \epsilon^2 \\
(t_2(\phi) - t_0)(t_2^*(\phi) - t_0^*) &= \epsilon^2 \\
(t_3(\phi) - t_0)(t_3^*(\phi) - t_0^*) &= \epsilon^2
\end{aligned}
$$

By expanding then subtracting any two of the above equations we get 

$$
t_i t_i^* - t_j t_j^* - \left[ t_0 (t_i^* - t_j^*) + t_0^* (t_i - t_j) \right] = 0
$$

From this we construct the linear system: 

$$
\begin{bmatrix}
(t_2^* - t_1^*) & (t_2 - t_1) \\
(t_3^* - t_2^*) & (t_3 - t_2)
\end{bmatrix}
\begin{bmatrix} t_0 \\ t_0^* \end{bmatrix}
=
\begin{bmatrix}
t_2 t_2^* - t_1 t_1^* \\
t_3 t_3^* - t_2 t_2^*
\end{bmatrix}
$$

The determinant of the above matrix is $(t_2^* - t_1^*)(t_3 - t_2) - (t_2 - t_1)(t_3^* - t_2^*)$, which is 
non-zero when $t_1$, $t_2$, and $t_3$ are not collinear, thus the solution of this system is the unique 
circle of some radius on which $t_1$, $t_2$, and $t_3$ fall. The case where the determinant is zero 
results when the three points lie on a circle of infinite radius. We can solve this for $t_0$ using 
Cramer's rule: 

$$
t_0 =
\frac{
\begin{vmatrix}
(t_2 t_2^* - t_1 t_1^*) & (t_2 - t_1) \\
(t_3 t_3^* - t_2 t_2^*) & (t_3 - t_2)
\end{vmatrix}
}{
\begin{vmatrix}
(t_2^* - t_1^*) & (t_2 - t_1) \\
(t_3^* - t_2^*) & (t_3 - t_2)
\end{vmatrix}
}
$$

which yields 

$$
t_0 = \frac{(t_2 t_2^* - t_1 t_1^*)(t_3 - t_2) - (t_3 t_3^* - t_2 t_2^*)(t_2 - t_1)}
           {(t_2^* - t_1^*)(t_3 - t_2) - (t_3^* - t_2^*)(t_2 - t_1)}
$$

Note that the solution for $t_0^*$ is consistent with the solution for $t_0$. We seek $\phi$ and $t_0$ such 
that this circle has radius $\epsilon$. So, substitute $t_0$ into 

$$
(t_1(\phi) - t_0)(t_1^*(\phi) - t_0^*) = \epsilon^2
$$

Now 

$$
(t_1 - t_0) = \frac{(t_2 - t_1)(t_3 - t_1)(t_3^* - t_2^*)}
                   {(t_2^* - t_1^*)(t_3 - t_2) - (t_2 - t_1)(t_3^* - t_2^*)}
$$

and finally 

$$
(t_2 - t_1)(t_3 - t_1)(t_3^* - t_2^*)(t_2^* - t_1^*)(t_3^* - t_1^*)(t_3 - t_2)
+ \epsilon^2 \left[ (t_2^* - t_1^*)(t_3 - t_2) - (t_2 - t_1)(t_3^* - t_2^*) \right]^2 = 0
$$
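
As a numerical check on this derivation, the following sketch evaluates the Cramer's-rule expression for the circle center and the left-hand side of the final radius condition at one fixed rotation, using Python's built-in complex numbers. The names `circle_center` and `radius_residual` are hypothetical, and the sketch does not solve for the rotation itself; it only tests whether three given centers lie on a common circle of radius eps.

```python
def _det(t1, t2, t3):
    """Determinant of the 2x2 linear system in the unknowns (t0, t0*)."""
    return ((t2 - t1).conjugate() * (t3 - t2)
            - (t3 - t2).conjugate() * (t2 - t1))


def circle_center(t1, t2, t3, tol=1e-12):
    """Center t0 of the circle through t1, t2, t3 via Cramer's rule."""
    d = _det(t1, t2, t3)
    if abs(d) < tol:
        raise ValueError("points are (nearly) collinear: infinite radius")
    # t_k * conj(t_k) is the squared modulus |t_k|^2.
    num = ((abs(t2) ** 2 - abs(t1) ** 2) * (t3 - t2)
           - (abs(t3) ** 2 - abs(t2) ** 2) * (t2 - t1))
    return num / d


def radius_residual(t1, t2, t3, eps):
    """Left-hand side of the final condition; zero iff the radius is eps.

    The six-factor product equals |t2-t1|^2 |t3-t1|^2 |t3-t2|^2, and the
    determinant is purely imaginary, so its square is a negative real.
    """
    d = _det(t1, t2, t3)
    prod = (abs(t2 - t1) * abs(t3 - t1) * abs(t3 - t2)) ** 2
    return prod + eps ** 2 * d ** 2
```

Because the determinant equals 4i times the signed triangle area, the residual vanishes exactly when eps equals the circumradius abc/(4 Area), which agrees with the geometric statement that the three match circles pass through one point.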

9.2 Topological Changes in Translation Space 

We claimed there were only three events which characterize the way in which the topology 
of intersection of match circles changes as circles follow their circular orbits as functions 
of the rotation, $\phi$, applied to the model features. The reason for this is fairly simple. Each 
distinct cell in translation space is bounded by one or more circles, and their intersection 
points. To change the topology, one cell must move into or out of another. To do so their 
boundaries must cross. When this happens either two, three, or more circles intersect at 
a single point. But if more than three intersect at a single point, any subset of three will 
do to solve the equations. There are also limiting cases where the cells of interest are just 
points, but these are handled as well. The case where two circles are coincident is the other 
special case which must be considered for topological effect. 

9.3 Complex Numbers and 2D vectors 

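We find it convenient mathematically to consider points in the plane as complex numbers. 
With this representation, rotation about the origin by an angle $\phi$ corresponds to 
multiplication by the complex exponential $e^{i\phi}$. If $\mathbf{a} = (a_x, a_y)^T$ and $\mathbf{b} = (b_x, b_y)^T$, as 
complex numbers we have 

$$
\mathbf{a} \leftrightarrow a = a_x + i a_y, \qquad \mathbf{b} \leftrightarrow b = b_x + i b_y.
$$

We also have 

$$
\mathbf{a} \cdot \mathbf{b} = \frac{ab^* + a^*b}{2}, \qquad
\mathbf{a} \times \mathbf{b} = \frac{i(ab^* - a^*b)}{2}
$$

where $x^*$ denotes the complex conjugate of $x$, and $\mathbf{a} \times \mathbf{b}$ is taken to be the scalar $a_x b_y - a_y b_x$. 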
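
These identities are easy to check with Python's built-in complex type. The sketch below is illustrative only; the helper names `dot`, `cross`, and `rotate` are hypothetical, and the cross product is assumed to be the scalar a_x b_y - a_y b_x.

```python
import cmath


def dot(a: complex, b: complex) -> float:
    """a . b = (a b* + a* b) / 2, with points stored as complex numbers."""
    return ((a * b.conjugate() + a.conjugate() * b) / 2).real


def cross(a: complex, b: complex) -> float:
    """a x b = i (a b* - a* b) / 2, the scalar a_x*b_y - a_y*b_x."""
    return (1j * (a * b.conjugate() - a.conjugate() * b) / 2).real


def rotate(z: complex, phi: float) -> complex:
    """Rotate z about the origin by phi: multiply by exp(i*phi)."""
    return cmath.exp(1j * phi) * z
```

For a = 3 + 4i and b = 1 - 2i these give a . b = -5 and a x b = -10, matching the componentwise formulas 3*1 + 4*(-2) and 3*(-2) - 4*1.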

9.4 Constructing The Initial Match-Sets 

A first step in the algorithm is to determine the intersection topology of the match-circles 
for some initial rotation $\phi_0$, and construct the associated match-sets. This can 
be done as follows. Compute the intersection points of all the circles, by first using an 
algorithm to intersect their bounding squares in $O(N \lg N + K)$ time for N circles with K 
intersection points. For each circle we sort its intersection points with other circles by angle. 
Because the circles are the same size, if the interiors of two circles intersect, then the circles 
must intersect (excluding the special case of concentricity). Thus if we step through the 
intersection points in order of angle, keeping track of when we enter or leave other circles, 
in two passes around each circle we can construct the set of match-sets associated with its 
intersection points. 
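
The two-pass walk around each circle can be sketched as follows. This is a simplified illustration under stated assumptions: circle centers are complex numbers, all circles share the radius eps, candidate pairs are found by brute force in O(N^2) rather than by intersecting bounding squares, and coincident circles are simply skipped. The function name `initial_match_sets` is hypothetical.

```python
import cmath
from math import acos, pi


def initial_match_sets(centers, eps):
    """Collect the index sets of circles meeting at each intersection point.

    centers: list of complex circle centers, all circles having radius eps.
    Returns a set of frozensets of circle indices.
    """
    found = set()
    for i, ci in enumerate(centers):
        events = []  # (angle on circle i, kind, j); kind 0 = enter, 1 = leave
        for j, cj in enumerate(centers):
            d = abs(cj - ci)
            if j == i or d == 0 or d > 2 * eps:
                continue  # same circle, coincident (special case), or disjoint
            theta = cmath.phase(cj - ci)   # direction toward the other center
            alpha = acos(d / (2 * eps))    # half-width of the arc inside j
            events.append(((theta - alpha) % (2 * pi), 0, j))
            events.append(((theta + alpha) % (2 * pi), 1, j))
        if not events:
            found.add(frozenset([i]))      # isolated circle
            continue
        events.sort()
        inside = set()
        # Pass 0 only establishes which circles contain the starting point;
        # pass 1 records the set of circles present at every intersection.
        for sweep in range(2):
            for _angle, kind, j in events:
                if kind == 0:
                    inside.add(j)
                else:
                    inside.discard(j)
                if sweep == 1:
                    found.add(frozenset(inside | {i, j}))
    return found
```

The first pass is needed because the walk starts at angle zero without knowing which circles already contain that point; after one full turn the `inside` set is correct, so the second pass can record reliable sets.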






References 

[1] H. Alt, K. Mehlhorn, H. Wagener, and E. Welzl. Congruence, similarity, and symmetries 
of geometric objects. In Discrete and Computational Geometry, volume 3, pages 
237-256. Springer-Verlag, New York, 1988. 

[2] N. Ayache and O. D. Faugeras. Hyper: A new approach for the recognition and 
positioning of two-dimensional objects. IEEE Transactions on Pattern Analysis and 
Machine Intelligence, 8(1):44-54, 1986. 

[3] Henry S. Baird. Model-Based Image Matching Using Location. MIT Press, Cambridge, 
MA, 1985. 

[4] R. C. Bolles and R. A. Cain. Recognizing and locating partially visible objects: The 
local-feature-focus method. International Journal of Robotics Research, 1(3):57-82, 
1982. 

[5] Todd Anthony Cass. Parallel computation in model-based recognition. Master's thesis, 
Massachusetts Institute of Technology, 1988. 

[6] Todd Anthony Cass. A robust implementation of 2D model-based recognition. In 
Proceedings IEEE Conf. on Computer Vision and Pattern Recognition, Ann Arbor, 
Michigan, 1988. 

[7] R. E. Ellis. Uncertainty estimates for polyhedral object recognition. In Proceedings of 
IEEE Conference on Robotics and Automation, pages 348-353, 1989. 

[8] W. Eric L. Grimson and Tomas Lozano-Perez. Localizing overlapping parts by searching 
the interpretation tree. IEEE Transactions on Pattern Analysis and Machine 
Intelligence, 9(4):469-482, July 1987. 

[9] Daniel P. Huttenlocher and Todd Anthony Cass. Measuring the quality of hypotheses 
in model-based recognition. To Be Published, 1990. 

[10] Christos H. Papadimitriou and Kenneth Steiglitz. Combinatorial Optimization: Algorithms 
and Complexity. Prentice-Hall, Englewood Cliffs, New Jersey, 1982. 

[11] W. A. Perkins. A model-based vision system for industrial parts. IEEE Transactions 
on Computers, 27(2):126-143, 1978. 

[12] F. P. Preparata and M. Shamos. Computational Geometry: An Introduction. Springer-Verlag, 
New York, 1985. 

[13] G. Stockman, S. Kopstein, and S. Bennet. Matching images to models for registration 
and object detection via clustering. IEEE Transactions on Pattern Analysis and 
Machine Intelligence, 4(3), May 1982. 






